Search Results

Documents authored by Martin, Daniel P.


Document
The Dynamic k-Mismatch Problem

Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length m and all length-m substrings of a given text of length n ≥ m. We focus on the well-studied k-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold k. Moreover, we assume n ≤ 2m (in general, one can partition the text into overlapping blocks). In this work, we develop data structures for the dynamic version of the k-mismatch problem supporting two operations: An update performs a single-letter substitution in the pattern or the text, whereas a query, given an index i, returns the Hamming distance between the pattern and the text substring starting at position i, or reports that the distance exceeds k. First, we describe a simple data structure with 𝒪̃(1) update time and 𝒪̃(k) query time. Through considerably more sophisticated techniques, we show that 𝒪̃(k) update time and 𝒪̃(1) query time is also achievable. These two solutions likely provide an essentially optimal trade-off for the dynamic k-mismatch problem with m^{Ω(1)} ≤ k ≤ √m: we prove that, in that case, conditioned on the 3SUM conjecture, one cannot simultaneously achieve k^{1-Ω(1)} time for all operations (updates and queries) after n^{𝒪(1)}-time initialization. For k ≥ √m, the same lower bound excludes achieving m^{1/2-Ω(1)} time per operation. This is known to be essentially tight for constant-sized alphabets: already Clifford et al. (STACS 2018) achieved 𝒪̃(√m) time per operation in that case, but their solution for large alphabets costs 𝒪̃(m^{3/4}) time per operation. We improve and extend the latter result by developing a trade-off algorithm that, given a parameter 1 ≤ x ≤ k, achieves update time 𝒪̃(m/k +√{mk/x}) and query time 𝒪̃(x). In particular, for k ≥ √m, an appropriate choice of x yields 𝒪̃(∛{mk}) time per operation, which is 𝒪̃(m^{2/3}) when only the trivial threshold k = m is provided.

Cite as

Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański. The Dynamic k-Mismatch Problem. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 18:1-18:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{clifford_et_al:LIPIcs.CPM.2022.18,
  author =	{Clifford, Rapha\"{e}l and Gawrychowski, Pawe{\l} and Kociumaka, Tomasz and Martin, Daniel P. and Uzna\'{n}ski, Przemys{\l}aw},
  title =	{{The Dynamic k-Mismatch Problem}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{18:1--18:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.18},
  URN =		{urn:nbn:de:0030-drops-161454},
  doi =		{10.4230/LIPIcs.CPM.2022.18},
  annote =	{Keywords: Pattern matching, Hamming distance, dynamic algorithms}
}
Document
RLE Edit Distance in Near Optimal Time

Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański

Published in: LIPIcs, Volume 138, 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019)


Abstract
We show that the edit distance between two run-length encoded strings of compressed lengths m and n respectively, can be computed in O(mn log(mn)) time. This improves the previous record by a factor of O(n/log(mn)). The running time of our algorithm is within subpolynomial factors of being optimal, subject to the standard SETH-hardness assumption. This effectively closes a line of algorithmic research first started in 1993.

Cite as

Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański. RLE Edit Distance in Near Optimal Time. In 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 138, pp. 66:1-66:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{clifford_et_al:LIPIcs.MFCS.2019.66,
  author =	{Clifford, Rapha\"{e}l and Gawrychowski, Pawe{\l} and Kociumaka, Tomasz and Martin, Daniel P. and Uzna\'{n}ski, Przemys{\l}aw},
  title =	{{RLE Edit Distance in Near Optimal Time}},
  booktitle =	{44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019)},
  pages =	{66:1--66:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-117-7},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{138},
  editor =	{Rossmanith, Peter and Heggernes, Pinar and Katoen, Joost-Pieter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2019.66},
  URN =		{urn:nbn:de:0030-drops-110109},
  doi =		{10.4230/LIPIcs.MFCS.2019.66},
  annote =	{Keywords: String algorithms, Compression, Pattern matching, Run-length encoding}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail